Overview

Dataset statistics

Number of variables: 13
Number of observations: 79853
Missing cells: 3265
Missing cells (%): 0.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.9 MiB
Average record size in memory: 104.0 B
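
The overview figures can be cross-checked against the per-variable counts reported further down. The sketch below only redoes that arithmetic. The figure of 8 bytes per column per row is an assumption (64-bit numbers or object pointers), not something the report states.

```python
# Cross-check the dataset statistics using counts reported in this document.
n_rows = 79853
n_cols = 13

# Missing values per variable, as listed in the per-variable sections.
missing_per_variable = {
    "application_underwriting_score": 2974,
    "Count_3-6_months_late": 97,
    "Count_6-12_months_late": 97,
    "Count_more_than_12_months_late": 97,
}

missing_cells = sum(missing_per_variable.values())
missing_pct = 100 * missing_cells / (n_rows * n_cols)

# Assumption: every column stores one 8-byte value per row.
bytes_per_row = n_cols * 8
total_mib = n_rows * bytes_per_row / 2**20

print(missing_cells)           # 3265
print(round(missing_pct, 1))   # 0.3
print(bytes_per_row)           # 104
print(round(total_mib, 1))     # 7.9
```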

Variable types

Numeric: 10
Categorical: 3

Warnings

Income is highly correlated with premium (High correlation)
premium is highly correlated with Income (High correlation)
application_underwriting_score has 2974 (3.7%) missing values (Missing)
Income is highly skewed (γ1 = 109.7611714) (Skewed)
id is uniformly distributed (Uniform)
id has unique values (Unique)
perc_premium_paid_by_cash_credit has 5723 (7.2%) zeros (Zeros)
Count_3-6_months_late has 66801 (83.7%) zeros (Zeros)
Count_6-12_months_late has 75831 (95.0%) zeros (Zeros)
Count_more_than_12_months_late has 76038 (95.2%) zeros (Zeros)
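
The "Skewed" warning refers to the skewness coefficient γ1. As a rough illustration, the sketch below computes the moment-based form g1 = m3 / m2^1.5 on made-up numbers. The function name and data are hypothetical, and the report's own estimator may apply a small-sample correction, so its values can differ slightly.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]            # hypothetical, mirror-symmetric sample
right_tailed = [1, 1, 1, 1, 1000]      # hypothetical, one extreme high value

print(skewness(symmetric))                 # 0.0
print(round(skewness(right_tailed), 6))    # 1.5
```

A single extreme value is enough to push g1 well above zero, which is consistent with Income's maximum of 90262600 against a median of 166560.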

Reproduction

Analysis started: 2021-10-18 16:31:55.389468
Analysis finished: 2021-10-18 16:32:48.342666
Duration: 52.95 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

id
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 79853
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 57167.16637
Minimum: 2
Maximum: 114076
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 624.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 5802.6
Q1: 28640
median: 57262
Q3: 85632
95-th percentile: 108370.4
Maximum: 114076
Range: 114074
Interquartile range (IQR): 56992

Descriptive statistics

Standard deviation: 32928.97016
Coefficient of variation (CV): 0.5760119357
Kurtosis: -1.200640812
Mean: 57167.16637
Median Absolute Deviation (MAD): 28494
Skewness: -0.004028155358
Sum: 4564969736
Variance: 1084317076
Monotonicity: Not monotonic
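
Several of these statistics follow from the others: range = maximum - minimum, IQR = Q3 - Q1, mean = sum / n, CV = standard deviation / mean, and variance = standard deviation squared. The sketch below re-derives them from the values reported for id, as a consistency check rather than a reproduction of how the report computes them.

```python
# Re-derive some of the reported `id` statistics from the others.
n = 79853
minimum, maximum = 2, 114076
q1, q3 = 28640, 85632
std = 32928.97016
total = 4564969736

value_range = maximum - minimum   # reported Range: 114074
iqr = q3 - q1                     # reported IQR: 56992
mean = total / n                  # reported Mean: 57167.16637
cv = std / mean                   # reported CV: 0.5760119357
variance = std ** 2               # reported Variance: 1084317076

print(value_range, iqr)
print(round(mean, 5), round(cv, 6), round(variance))
```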
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
2047 | 1 | < 0.1%
39558 | 1 | < 0.1%
82557 | 1 | < 0.1%
88702 | 1 | < 0.1%
86655 | 1 | < 0.1%
43648 | 1 | < 0.1%
47746 | 1 | < 0.1%
45699 | 1 | < 0.1%
33413 | 1 | < 0.1%
60040 | 1 | < 0.1%
Other values (79843) | 79843 | > 99.9%

Value | Count | Frequency (%)
2 | 1 | < 0.1%
3 | 1 | < 0.1%
5 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
13 | 1 | < 0.1%
14 | 1 | < 0.1%
15 | 1 | < 0.1%
19 | 1 | < 0.1%

Value | Count | Frequency (%)
114076 | 1 | < 0.1%
114075 | 1 | < 0.1%
114074 | 1 | < 0.1%
114073 | 1 | < 0.1%
114071 | 1 | < 0.1%
114070 | 1 | < 0.1%
114069 | 1 | < 0.1%
114066 | 1 | < 0.1%
114065 | 1 | < 0.1%
114063 | 1 | < 0.1%

perc_premium_paid_by_cash_credit
Real number (ℝ≥0)

ZEROS

Distinct: 1001
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3142877412
Minimum: 0
Maximum: 1
Zeros: 5723
Zeros (%): 7.2%
Negative: 0
Negative (%): 0.0%
Memory size: 624.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.034
median: 0.167
Q3: 0.538
95-th percentile: 1
Maximum: 1
Range: 1
Interquartile range (IQR): 0.504

Descriptive statistics

Standard deviation: 0.3349145654
Coefficient of variation (CV): 1.065630381
Kurtosis: -0.6144351478
Mean: 0.3142877412
Median Absolute Deviation (MAD): 0.158
Skewness: 0.8929224698
Sum: 25096.819
Variance: 0.1121677661
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 5723 | 7.2%
1 | 5004 | 6.3%
0.001 | 698 | 0.9%
0.002 | 612 | 0.8%
0.003 | 551 | 0.7%
0.004 | 541 | 0.7%
0.006 | 541 | 0.7%
0.005 | 530 | 0.7%
0.01 | 496 | 0.6%
0.008 | 488 | 0.6%
Other values (991) | 64669 | 81.0%

Value | Count | Frequency (%)
0 | 5723 | 7.2%
0.001 | 698 | 0.9%
0.002 | 612 | 0.8%
0.003 | 551 | 0.7%
0.004 | 541 | 0.7%
0.005 | 530 | 0.7%
0.006 | 541 | 0.7%
0.007 | 463 | 0.6%
0.008 | 488 | 0.6%
0.009 | 473 | 0.6%

Value | Count | Frequency (%)
1 | 5004 | 6.3%
0.999 | 43 | 0.1%
0.998 | 38 | < 0.1%
0.997 | 52 | 0.1%
0.996 | 51 | 0.1%
0.995 | 46 | 0.1%
0.994 | 49 | 0.1%
0.993 | 43 | 0.1%
0.992 | 50 | 0.1%
0.991 | 42 | 0.1%

age_in_days
Real number (ℝ≥0)

Distinct: 833
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18846.69691
Minimum: 7670
Maximum: 37602
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 624.0 KiB

Quantile statistics

Minimum: 7670
5-th percentile: 10596
Q1: 14974
median: 18625
Q3: 22636
95-th percentile: 27754
Maximum: 37602
Range: 29932
Interquartile range (IQR): 7662

Descriptive statistics

Standard deviation: 5208.719136
Coefficient of variation (CV): 0.2763730516
Kurtosis: -0.4597678551
Mean: 18846.69691
Median Absolute Deviation (MAD): 3654
Skewness: 0.2225719582
Sum: 1504965288
Variance: 27130755.04
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
17899 | 226 | 0.3%
16802 | 223 | 0.3%
17895 | 219 | 0.3%
17163 | 217 | 0.3%
16801 | 216 | 0.3%
19350 | 213 | 0.3%
18261 | 212 | 0.3%
17894 | 211 | 0.3%
19358 | 210 | 0.3%
17528 | 209 | 0.3%
Other values (823) | 77697 | 97.3%

Value | Count | Frequency (%)
7670 | 3 | < 0.1%
7671 | 3 | < 0.1%
7672 | 7 | < 0.1%
7673 | 3 | < 0.1%
7674 | 6 | < 0.1%
7675 | 5 | < 0.1%
7676 | 7 | < 0.1%
7677 | 7 | < 0.1%
7678 | 5 | < 0.1%
7679 | 7 | < 0.1%

Value | Count | Frequency (%)
37602 | 1 | < 0.1%
37240 | 1 | < 0.1%
37239 | 1 | < 0.1%
36874 | 1 | < 0.1%
36870 | 1 | < 0.1%
36150 | 1 | < 0.1%
36145 | 2 | < 0.1%
35784 | 1 | < 0.1%
35419 | 1 | < 0.1%
35417 | 1 | < 0.1%

Income
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 24165
Distinct (%): 30.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 208847.1712
Minimum: 24030
Maximum: 90262600
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 624.0 KiB

Quantile statistics

Minimum: 24030
5-th percentile: 54030
Q1: 108010
median: 166560
Q3: 252090
95-th percentile: 450050
Maximum: 90262600
Range: 90238570
Interquartile range (IQR): 144080

Descriptive statistics

Standard deviation: 496582.5973
Coefficient of variation (CV): 2.377731977
Kurtosis: 16639.12845
Mean: 208847.1712
Median Absolute Deviation (MAD): 68250
Skewness: 109.7611714
Sum: 1.667707316 × 10^10
Variance: 2.465942759 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
150130 | 171 | 0.2%
150060 | 162 | 0.2%
150090 | 158 | 0.2%
150120 | 157 | 0.2%
150050 | 154 | 0.2%
150100 | 152 | 0.2%
150070 | 149 | 0.2%
150110 | 142 | 0.2%
150030 | 136 | 0.2%
150080 | 135 | 0.2%
Other values (24155) | 78337 | 98.1%

Value | Count | Frequency (%)
24030 | 13 | < 0.1%
24040 | 10 | < 0.1%
24050 | 11 | < 0.1%
24060 | 2 | < 0.1%
24070 | 3 | < 0.1%
24080 | 10 | < 0.1%
24090 | 8 | < 0.1%
24100 | 8 | < 0.1%
24110 | 13 | < 0.1%
24120 | 13 | < 0.1%

Value | Count | Frequency (%)
90262600 | 1 | < 0.1%
53821900 | 1 | < 0.1%
46803140 | 1 | < 0.1%
32175090 | 1 | < 0.1%
25051240 | 1 | < 0.1%
21075130 | 1 | < 0.1%
20986030 | 1 | < 0.1%
17471210 | 1 | < 0.1%
16874010 | 1 | < 0.1%
12847560 | 1 | < 0.1%

Count_3-6_months_late
Real number (ℝ≥0)

ZEROS

Distinct: 14
Distinct (%): < 0.1%
Missing: 97
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2486709464
Minimum: 0
Maximum: 13
Zeros: 66801
Zeros (%): 83.7%
Negative: 0
Negative (%): 0.0%
Memory size: 624.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 13
Range: 13
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.6914680986
Coefficient of variation (CV): 2.780654953
Kurtosis: 24.90942403
Mean: 0.2486709464
Median Absolute Deviation (MAD): 0
Skewness: 4.150115635
Sum: 19833
Variance: 0.4781281314
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)

Value | Count | Frequency (%)
0 | 66801 | 83.7%
1 | 8826 | 11.1%
2 | 2519 | 3.2%
3 | 954 | 1.2%
4 | 374 | 0.5%
5 | 168 | 0.2%
6 | 68 | 0.1%
7 | 23 | < 0.1%
8 | 15 | < 0.1%
9 | 4 | < 0.1%
Other values (4) | 4 | < 0.1%
(Missing) | 97 | 0.1%

Value | Count | Frequency (%)
0 | 66801 | 83.7%
1 | 8826 | 11.1%
2 | 2519 | 3.2%
3 | 954 | 1.2%
4 | 374 | 0.5%
5 | 168 | 0.2%
6 | 68 | 0.1%
7 | 23 | < 0.1%
8 | 15 | < 0.1%
9 | 4 | < 0.1%

Value | Count | Frequency (%)
13 | 1 | < 0.1%
12 | 1 | < 0.1%
11 | 1 | < 0.1%
10 | 1 | < 0.1%
9 | 4 | < 0.1%
8 | 15 | < 0.1%
7 | 23 | < 0.1%
6 | 68 | 0.1%
5 | 168 | 0.2%
4 | 374 | 0.5%

Count_6-12_months_late
Real number (ℝ≥0)

ZEROS

Distinct: 17
Distinct (%): < 0.1%
Missing: 97
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.07818847485
Minimum: 0
Maximum: 17
Zeros: 75831
Zeros (%): 95.0%
Negative: 0
Negative (%): 0.0%
Memory size: 624.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 17
Range: 17
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4365074263
Coefficient of variation (CV): 5.582759188
Kurtosis: 185.0243945
Mean: 0.07818847485
Median Absolute Deviation (MAD): 0
Skewness: 10.35329931
Sum: 6236
Variance: 0.1905387333
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)

Value | Count | Frequency (%)
0 | 75831 | 95.0%
1 | 2680 | 3.4%
2 | 693 | 0.9%
3 | 317 | 0.4%
4 | 130 | 0.2%
5 | 46 | 0.1%
6 | 26 | < 0.1%
7 | 11 | < 0.1%
8 | 5 | < 0.1%
10 | 4 | < 0.1%
Other values (7) | 13 | < 0.1%
(Missing) | 97 | 0.1%

Value | Count | Frequency (%)
0 | 75831 | 95.0%
1 | 2680 | 3.4%
2 | 693 | 0.9%
3 | 317 | 0.4%
4 | 130 | 0.2%
5 | 46 | 0.1%
6 | 26 | < 0.1%
7 | 11 | < 0.1%
8 | 5 | < 0.1%
9 | 4 | < 0.1%

Value | Count | Frequency (%)
17 | 1 | < 0.1%
15 | 1 | < 0.1%
14 | 2 | < 0.1%
13 | 2 | < 0.1%
12 | 1 | < 0.1%
11 | 2 | < 0.1%
10 | 4 | < 0.1%
9 | 4 | < 0.1%
8 | 5 | < 0.1%
7 | 11 | < 0.1%

Count_more_than_12_months_late
Real number (ℝ≥0)

ZEROS

Distinct: 10
Distinct (%): < 0.1%
Missing: 97
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.06000802447
Minimum: 0
Maximum: 11
Zeros: 76038
Zeros (%): 95.2%
Negative: 0
Negative (%): 0.0%
Memory size: 624.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 11
Range: 11
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.3120227225
Coefficient of variation (CV): 5.199683296
Kurtosis: 98.44852981
Mean: 0.06000802447
Median Absolute Deviation (MAD): 0
Skewness: 7.850187565
Sum: 4786
Variance: 0.09735817936
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 76038 | 95.2%
1 | 2996 | 3.8%
2 | 498 | 0.6%
3 | 151 | 0.2%
4 | 48 | 0.1%
5 | 13 | < 0.1%
6 | 6 | < 0.1%
7 | 3 | < 0.1%
8 | 2 | < 0.1%
11 | 1 | < 0.1%
(Missing) | 97 | 0.1%

Value | Count | Frequency (%)
0 | 76038 | 95.2%
1 | 2996 | 3.8%
2 | 498 | 0.6%
3 | 151 | 0.2%
4 | 48 | 0.1%
5 | 13 | < 0.1%
6 | 6 | < 0.1%
7 | 3 | < 0.1%
8 | 2 | < 0.1%
11 | 1 | < 0.1%

Value | Count | Frequency (%)
11 | 1 | < 0.1%
8 | 2 | < 0.1%
7 | 3 | < 0.1%
6 | 6 | < 0.1%
5 | 13 | < 0.1%
4 | 48 | 0.1%
3 | 151 | 0.2%
2 | 498 | 0.6%
1 | 2996 | 3.8%
0 | 76038 | 95.2%

application_underwriting_score
Real number (ℝ≥0)

MISSING

Distinct: 672
Distinct (%): 0.9%
Missing: 2974
Missing (%): 3.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 99.0672912
Minimum: 91.9
Maximum: 99.89
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 624.0 KiB

Quantile statistics

Minimum: 91.9
5-th percentile: 97.81
Q1: 98.81
median: 99.21
Q3: 99.54
95-th percentile: 99.87
Maximum: 99.89
Range: 7.99
Interquartile range (IQR): 0.73

Descriptive statistics

Standard deviation: 0.7397990154
Coefficient of variation (CV): 0.007467641504
Kurtosis: 13.98790359
Mean: 99.0672912
Median Absolute Deviation (MAD): 0.36
Skewness: -2.756097905
Sum: 7616194.28
Variance: 0.5473025832
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
99.89 | 1972 | 2.5%
99.88 | 1343 | 1.7%
99.87 | 856 | 1.1%
99.86 | 693 | 0.9%
99.3 | 642 | 0.8%
99.38 | 633 | 0.8%
99.31 | 623 | 0.8%
99.28 | 617 | 0.8%
99.37 | 616 | 0.8%
99.23 | 614 | 0.8%
Other values (662) | 68270 | 85.5%
(Missing) | 2974 | 3.7%

Value | Count | Frequency (%)
91.9 | 1 | < 0.1%
91.96 | 3 | < 0.1%
91.98 | 1 | < 0.1%
92.03 | 2 | < 0.1%
92.07 | 1 | < 0.1%
92.13 | 2 | < 0.1%
92.15 | 1 | < 0.1%
92.16 | 1 | < 0.1%
92.17 | 1 | < 0.1%
92.2 | 1 | < 0.1%

Value | Count | Frequency (%)
99.89 | 1972 | 2.5%
99.88 | 1343 | 1.7%
99.87 | 856 | 1.1%
99.86 | 693 | 0.9%
99.85 | 550 | 0.7%
99.84 | 463 | 0.6%
99.83 | 413 | 0.5%
99.82 | 358 | 0.4%
99.81 | 372 | 0.5%
99.8 | 379 | 0.5%

no_of_premiums_paid
Real number (ℝ≥0)

Distinct: 57
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.86388739
Minimum: 2
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 624.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 4
Q1: 7
median: 10
Q3: 14
95-th percentile: 20
Maximum: 60
Range: 58
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 5.170687238
Coefficient of variation (CV): 0.475951844
Kurtosis: 3.325117066
Mean: 10.86388739
Median Absolute Deviation (MAD): 3
Skewness: 1.226636685
Sum: 867514
Variance: 26.73600651
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
8 | 7184 | 9.0%
9 | 7158 | 9.0%
10 | 6873 | 8.6%
7 | 6623 | 8.3%
11 | 6395 | 8.0%
6 | 5635 | 7.1%
12 | 5407 | 6.8%
13 | 4752 | 6.0%
5 | 4215 | 5.3%
14 | 3988 | 5.0%
Other values (47) | 21623 | 27.1%

Value | Count | Frequency (%)
2 | 726 | 0.9%
3 | 1746 | 2.2%
4 | 2907 | 3.6%
5 | 4215 | 5.3%
6 | 5635 | 7.1%
7 | 6623 | 8.3%
8 | 7184 | 9.0%
9 | 7158 | 9.0%
10 | 6873 | 8.6%
11 | 6395 | 8.0%

Value | Count | Frequency (%)
60 | 1 | < 0.1%
59 | 1 | < 0.1%
58 | 2 | < 0.1%
56 | 3 | < 0.1%
55 | 1 | < 0.1%
54 | 2 | < 0.1%
53 | 2 | < 0.1%
52 | 2 | < 0.1%
51 | 3 | < 0.1%
50 | 3 | < 0.1%

sourcing_channel
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 624.0 KiB

Value | Count
A | 43134
B | 16512
C | 12039
D | 7559
E | 609

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 79853
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C
2nd row: A
3rd row: C
4th row: A
5th row: B

Common Values

Value | Count | Frequency (%)
A | 43134 | 54.0%
B | 16512 | 20.7%
C | 12039 | 15.1%
D | 7559 | 9.5%
E | 609 | 0.8%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
a | 43134 | 54.0%
b | 16512 | 20.7%
c | 12039 | 15.1%
d | 7559 | 9.5%
e | 609 | 0.8%

Most occurring characters

Value | Count | Frequency (%)
A | 43134 | 54.0%
B | 16512 | 20.7%
C | 12039 | 15.1%
D | 7559 | 9.5%
E | 609 | 0.8%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 79853 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
A | 43134 | 54.0%
B | 16512 | 20.7%
C | 12039 | 15.1%
D | 7559 | 9.5%
E | 609 | 0.8%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 79853 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
A | 43134 | 54.0%
B | 16512 | 20.7%
C | 12039 | 15.1%
D | 7559 | 9.5%
E | 609 | 0.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 79853 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
A | 43134 | 54.0%
B | 16512 | 20.7%
C | 12039 | 15.1%
D | 7559 | 9.5%
E | 609 | 0.8%

residence_area_type
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 624.0 KiB

Value | Count
Urban | 48183
Rural | 31670

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 399265
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Urban
2nd row: Urban
3rd row: Rural
4th row: Urban
5th row: Urban

Common Values

Value | Count | Frequency (%)
Urban | 48183 | 60.3%
Rural | 31670 | 39.7%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
urban | 48183 | 60.3%
rural | 31670 | 39.7%

Most occurring characters

Value | Count | Frequency (%)
r | 79853 | 20.0%
a | 79853 | 20.0%
U | 48183 | 12.1%
b | 48183 | 12.1%
n | 48183 | 12.1%
R | 31670 | 7.9%
u | 31670 | 7.9%
l | 31670 | 7.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 319412 | 80.0%
Uppercase Letter | 79853 | 20.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 79853 | 25.0%
a | 79853 | 25.0%
b | 48183 | 15.1%
n | 48183 | 15.1%
u | 31670 | 9.9%
l | 31670 | 9.9%

Uppercase Letter
Value | Count | Frequency (%)
U | 48183 | 60.3%
R | 31670 | 39.7%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 399265 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
r | 79853 | 20.0%
a | 79853 | 20.0%
U | 48183 | 12.1%
b | 48183 | 12.1%
n | 48183 | 12.1%
R | 31670 | 7.9%
u | 31670 | 7.9%
l | 31670 | 7.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 399265 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
r | 79853 | 20.0%
a | 79853 | 20.0%
U | 48183 | 12.1%
b | 48183 | 12.1%
n | 48183 | 12.1%
R | 31670 | 7.9%
u | 31670 | 7.9%
l | 31670 | 7.9%

premium
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 30
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10924.50753
Minimum: 1200
Maximum: 60000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 624.0 KiB

Quantile statistics

Minimum: 1200
5-th percentile: 1200
Q1: 5400
median: 7500
Q3: 13800
95-th percentile: 28500
Maximum: 60000
Range: 58800
Interquartile range (IQR): 8400

Descriptive statistics

Standard deviation: 9401.676542
Coefficient of variation (CV): 0.8606041521
Kurtosis: 6.67178945
Mean: 10924.50753
Median Absolute Deviation (MAD): 4200
Skewness: 2.19817831
Sum: 872354700
Variance: 88391521.8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)

Value | Count | Frequency (%)
5400 | 10932 | 13.7%
7500 | 10194 | 12.8%
3300 | 9857 | 12.3%
9600 | 8419 | 10.5%
1200 | 6900 | 8.6%
11700 | 6832 | 8.6%
13800 | 6548 | 8.2%
5700 | 3071 | 3.8%
18000 | 3062 | 3.8%
15900 | 2831 | 3.5%
Other values (20) | 11207 | 14.0%

Value | Count | Frequency (%)
1200 | 6900 | 8.6%
3300 | 9857 | 12.3%
5400 | 10932 | 13.7%
5700 | 3071 | 3.8%
7500 | 10194 | 12.8%
9600 | 8419 | 10.5%
11700 | 6832 | 8.6%
13800 | 6548 | 8.2%
15900 | 2831 | 3.5%
18000 | 3062 | 3.8%

Value | Count | Frequency (%)
60000 | 419 | 0.5%
57900 | 187 | 0.2%
55800 | 52 | 0.1%
53700 | 71 | 0.1%
51600 | 81 | 0.1%
49500 | 100 | 0.1%
47400 | 134 | 0.2%
45300 | 141 | 0.2%
43200 | 158 | 0.2%
41100 | 207 | 0.3%

target
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 624.0 KiB

Value | Count
1 | 74855
0 | 4998

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 79853
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 74855 | 93.7%
0 | 4998 | 6.3%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
1 | 74855 | 93.7%
0 | 4998 | 6.3%

Most occurring characters

Value | Count | Frequency (%)
1 | 74855 | 93.7%
0 | 4998 | 6.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 79853 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 74855 | 93.7%
0 | 4998 | 6.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 79853 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 74855 | 93.7%
0 | 4998 | 6.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 79853 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 74855 | 93.7%
0 | 4998 | 6.3%

Interactions

[100 interaction plots (Matplotlib v3.3.2); images not preserved.]

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
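
As a minimal sketch of that calculation (the function name and sample data are illustrative, not taken from the report), the covariance and both standard deviations can be computed directly. The 1/n factors cancel, so plain sums are enough.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and variances cancel, so sums suffice.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [1, 2, 3, 4, 5]
linear_up = pearson_r(xs, [2 * x + 10 for x in xs])   # perfectly linear, increasing
linear_down = pearson_r(xs, [-3 * x for x in xs])     # perfectly linear, decreasing
cubic = pearson_r(xs, [x ** 3 for x in xs])           # monotonic but not linear

print(linear_up, linear_down, round(cubic, 3))
```

The cubic case stays below 1 even though y always increases with x, which is the gap the rank-based coefficients are meant to close.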

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
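
A minimal sketch of that procedure follows: rank both variables, then apply the Pearson formula to the ranks. Assigning tied values the average of their positions is a common convention assumed here, and all names and data are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
rho_cubic = spearman_rho(xs, [x ** 3 for x in xs])   # nonlinear but monotonic
print(average_ranks([10, 20, 20, 30]))               # [1.0, 2.5, 2.5, 4.0]
print(rho_cubic)
```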

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
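
The sketch below implements exactly that definition, often called tau-a. Implementations frequently use a tie-adjusted variant instead, so for columns with many repeated values (such as the Count_* variables here) the numbers behind the report's heatmap may differ from what this function returns. The function name and data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie adjustment."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1      # both variables order the pair the same way
        elif product < 0:
            discordant += 1      # the variables order the pair in opposite ways
        # product == 0 is a tie: it counts toward the total only
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

xs = [1, 2, 3, 4]
print(kendall_tau_a(xs, [10, 20, 30, 40]))   # 1.0
print(kendall_tau_a(xs, [40, 30, 20, 10]))   # -1.0
one_swap = kendall_tau_a(xs, [10, 30, 20, 40])   # 5 concordant, 1 discordant
print(round(one_swap, 4))
```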

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
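
For illustration, here is a sketch of the bias-corrected estimator as it is usually written: the chi-squared statistic is divided by n, shrunk by (r-1)(k-1)/(n-1), and normalised with corrected row and column counts. This is a stand-alone reconstruction under that assumption, not the report's own code, and the sample data are hypothetical.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two nominal sequences of equal length."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over every cell, including empty ones.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint[(a, b)] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Hypothetical data: area fully determines the label, or tells nothing about it.
area = ["Urban"] * 50 + ["Rural"] * 50
v_perfect = cramers_v_corrected(area, ["1"] * 50 + ["0"] * 50)
v_independent = cramers_v_corrected(["Urban", "Urban", "Rural", "Rural"] * 25,
                                    ["1", "0", "1", "0"] * 25)
print(round(v_perfect, 6), v_independent)
```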

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
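
The nullity correlation shown in the heatmap can be thought of as an ordinary correlation between "is present" indicators. The sketch below assumes that reading and uses hypothetical columns. The report lists 97 missing values for each of the three Count_* variables but does not say whether they fall in the same rows, which is the kind of question this measure answers.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None   # a fully present (or fully missing) column has no variation
    return cov / math.sqrt(var_a * var_b)

# Hypothetical columns; None marks a missing value.
late_3_6 = [0, None, 1, 0, None, 2]
late_6_12 = [0, None, 0, 0, None, 1]
score = [99.1, 98.7, None, 99.5, 99.0, 99.3]

same_rows = nullity_correlation(late_3_6, late_6_12)   # missing in the same rows
different_rows = nullity_correlation(late_3_6, score)  # missing in different rows
print(round(same_rows, 3), round(different_rows, 3))
```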

Sample

First rows

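index | id | perc_premium_paid_by_cash_credit | age_in_days | Income | Count_3-6_months_late | Count_6-12_months_late | Count_more_than_12_months_late | application_underwriting_score | no_of_premiums_paid | sourcing_channel | residence_area_type | premium | target
0 | 110936 | 0.429 | 12058 | 355060 | 0.0 | 0.0 | 0.0 | 99.02 | 13 | C | Urban | 3300 | 1
1 | 41492 | 0.010 | 21546 | 315150 | 0.0 | 0.0 | 0.0 | 99.89 | 21 | A | Urban | 18000 | 1
2 | 31300 | 0.917 | 17531 | 84140 | 2.0 | 3.0 | 1.0 | 98.69 | 7 | C | Rural | 3300 | 0
3 | 19415 | 0.049 | 15341 | 250510 | 0.0 | 0.0 | 0.0 | 99.57 | 9 | A | Urban | 9600 | 1
4 | 99379 | 0.052 | 31400 | 198680 | 0.0 | 0.0 | 0.0 | 99.87 | 12 | B | Urban | 9600 | 1
5 | 59951 | 0.540 | 17527 | 282080 | 2.0 | 0.0 | 0.0 | 99.18 | 9 | B | Rural | 22200 | 1
6 | 54031 | 1.000 | 24829 | 118400 | 0.0 | 0.0 | 0.0 | 99.05 | 11 | B | Urban | 7500 | 1
7 | 94290 | 1.000 | 21911 | 180240 | 1.0 | 6.0 | 4.0 | 99.33 | 3 | A | Urban | 9600 | 0
8 | 93730 | 0.621 | 9868 | 92520 | 0.0 | 0.0 | 0.0 | 99.58 | 4 | A | Urban | 7500 | 1
9 | 84844 | 0.908 | 23008 | 107180 | 2.0 | 0.0 | 0.0 | 98.91 | 11 | A | Rural | 5400 | 0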

Last rows

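index | id | perc_premium_paid_by_cash_credit | age_in_days | Income | Count_3-6_months_late | Count_6-12_months_late | Count_more_than_12_months_late | application_underwriting_score | no_of_premiums_paid | sourcing_channel | residence_area_type | premium | target
79843 | 84628 | 0.454 | 25928 | 70800 | 0.0 | 0.0 | 0.0 | 98.19 | 7 | A | Urban | 5700 | 1
79844 | 11262 | 0.994 | 20445 | 186830 | 0.0 | 0.0 | 0.0 | 99.67 | 5 | A | Urban | 18000 | 1
79845 | 25366 | 0.825 | 14979 | 360060 | 2.0 | 0.0 | 0.0 | 98.61 | 10 | D | Urban | 24300 | 1
79846 | 104705 | 0.118 | 22275 | 195070 | 0.0 | 0.0 | 0.0 | 99.25 | 11 | A | Urban | 9600 | 1
79847 | 91081 | 0.033 | 18265 | 301540 | 0.0 | 0.0 | 0.0 | 99.89 | 4 | A | Rural | 13800 | 1
79848 | 48057 | 0.425 | 23367 | 224550 | 1.0 | 0.0 | 0.0 | 98.70 | 19 | B | Urban | 13800 | 1
79849 | 59012 | 0.704 | 19356 | 279150 | 1.0 | 0.0 | 1.0 | 99.42 | 12 | A | Rural | 28500 | 1
79850 | 77050 | 0.000 | 23372 | 305020 | 0.0 | 0.0 | 0.0 | 98.89 | 12 | A | Rural | 9600 | 1
79851 | 67225 | 0.398 | 22641 | 39330 | 0.0 | 0.0 | 0.0 | 98.68 | 8 | A | Rural | 5700 | 1
79852 | 71531 | 0.550 | 15709 | 280140 | 1.0 | 0.0 | 1.0 | 99.84 | 8 | A | Urban | 9600 | 0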